---
title: Remove Stopwords
description: Remove language-specific stopwords from the index
canonical: https://docs.paradedb.com/documentation/token-filters/stopwords
---

Stopwords are words that are so common or semantically insignificant in most contexts that they can be ignored during indexing.
In English, for example, stopwords include "a", "and", and "or".

All tokenizers except the [literal](/documentation/tokenizers/available-tokenizers/literal) tokenizer can be configured to automatically remove stopwords
for a given language by setting `stopwords_language`.

```sql
CREATE INDEX search_idx ON mock_items
USING bm25 (id, (description::pdb.simple('stopwords_language=english')))
WITH (key_field='id');
```

Valid values for `stopwords_language` are `danish`, `dutch`, `english`, `finnish`, `french`, `german`, `hungarian`, `italian`, `norwegian`, `portuguese`, `russian`, `spanish`, and `swedish`.

To demonstrate this token filter, compare the output of tokenizing the same text without and with English stopword removal:

```sql
SELECT
  'The cat in the hat'::pdb.simple::text[],
  'The cat in the hat'::pdb.simple('stopwords_language=english')::text[];
```

```ini Expected Response
         text         |   text
----------------------+-----------
 {the,cat,in,the,hat} | {cat,hat}
(1 row)
```
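
The same comparison can be sketched for any other supported language. The French sentence below is an illustrative input that does not come from the example above, and which tokens are dropped depends on the French stopword list that ships with the tokenizer, so no expected output is shown.

```sql
-- Illustrative input: the tokens removed depend on the built-in French stopword list.
SELECT
  'Le chat dans le chapeau'::pdb.simple::text[],
  'Le chat dans le chapeau'::pdb.simple('stopwords_language=french')::text[];
```

Running a check like this before creating an index is a quick way to see how a given `stopwords_language` setting affects the tokens that end up in the index.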
